Learning from Inconsistent and Noisy Data: The AQ18 Approach

Authors

  • Kenneth A. Kaufman
  • Ryszard S. Michalski
Abstract

In concept learning or data mining tasks, the learner is typically faced with a choice of many possible hypotheses characterizing the data. If one can assume that the training data are noise-free, then the generated hypothesis should be complete and consistent with regard to the data. In real-world problems, however, data are often noisy, and an insistence on full completeness and consistency is no longer valid. The problem then is to determine a hypothesis that represents the "best" trade-off between completeness and consistency. This paper presents an approach to this problem in which a learner seeks rules optimizing a description quality criterion that combines completeness and consistency gain, a measure based on consistency that reflects the rule's benefit. The method has been implemented in the AQ18 learning and data mining system and compared to several other methods. Experiments have indicated the flexibility and power of the proposed method.
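The abstract does not state the quality criterion explicitly, so the following is only a minimal sketch of how such a trade-off could be scored. It assumes a weighted geometric combination of completeness and consistency gain, with consistency gain taken as the rule's consistency improvement over the positive-class prior. The function names, the weight parameter `w`, and the clamping at zero are illustrative assumptions, not details taken from this page.

```python
def completeness(p, P):
    """Fraction of all positive examples (P) that the rule covers (p)."""
    return p / P


def consistency(p, n):
    """Fraction of the examples covered by the rule that are positive."""
    covered = p + n
    return p / covered if covered else 0.0


def consistency_gain(p, n, P, N):
    """Assumed form: consistency improvement over the positive-class prior,
    scaled so that a fully consistent rule scores 1. Requires N > 0.
    Rules no better than the prior are clamped to 0 (an assumption)."""
    prior = P / (P + N)
    gain = (consistency(p, n) - prior) / (1.0 - prior)
    return max(gain, 0.0)


def rule_quality(p, n, P, N, w=0.5):
    """Assumed combination: completeness^w * consistency_gain^(1 - w).
    A larger w favors coverage; a smaller w favors consistency."""
    return completeness(p, P) ** w * consistency_gain(p, n, P, N) ** (1.0 - w)


# Two hypothetical rules on a data set with 50 positive and 50 negative examples.
# Rule A covers 40 positives and no negatives; rule B covers all 50 positives
# but also 10 negatives.
rule_a = dict(p=40, n=0, P=50, N=50)
rule_b = dict(p=50, n=10, P=50, N=50)

for w in (0.5, 0.9):
    qa = rule_quality(w=w, **rule_a)
    qb = rule_quality(w=w, **rule_b)
    print(f"w={w}: Q(A)={qa:.3f}, Q(B)={qb:.3f}")
```

Under these assumptions, an equal weighting prefers the fully consistent rule A, while a weighting that strongly favors completeness prefers rule B, which illustrates how a single tunable criterion can express different tolerances for noise.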

Similar resources

Learning in an Inconsistent World: Rule Selection in STAR/AQ18

In concept learning and data mining tasks, the learner is typically faced with a choice of many possible hypotheses generalizing the input data. If one can assume that training data contains no noise, then the primary conditions a hypothesis must satisfy are consistency and completeness with regard to the data. In real-world applications, however, data are often noisy, and the insistence on the...

Damage identification of structures using second-order approximation of Neumann series expansion

In this paper, a novel approach is proposed for structural damage detection from a limited number of sensors using an extreme learning machine (ELM). Because the number of sensors used to measure modal data is normally limited and is usually less than the number of DOFs in the finite element model, a model reduction approach should be used to match the incomplete measured mode shapes. The second-order a...

Evaluating the Impact of Coder Errors on Active Learning

Active Learning (AL) has been proposed as a technique to reduce the amount of annotated data needed in the context of supervised classification. While various simulation studies for a number of NLP tasks have shown that AL works well on gold-standard data, there is some doubt whether the approach can be successful when applied to noisy, real-world data sets. This paper presents a thorough evalua...

A Multi-Objective Approach to Fuzzy Clustering using ITLBO Algorithm

Data clustering is one of the most important areas of research in data mining and knowledge discovery. Recent research in this area has shown that the best clustering results can be achieved using multi-objective methods. In other words, assuming more than one criterion as objective functions for clustering data can measurably increase the quality of clustering. In this study, a model with two ...

The effects of traffic noise on memory and auditory-verbal learning in Persian language children

Background: Acoustic noise is one of the universal pollutants of modern society. Although the adverse effects of high noise levels on human hearing have been known for many years, the non-auditory effects of noise, such as effects on cognition, learning, memory and reading, especially in children, have received less attention. Factors which have a negative impact on these features can also have a negat...

Publication date: 1999